# Low CER

Whisper Finetune Teochew
A Teochew (Chaoshan) speech recognition model fine-tuned from Whisper-medium that transcribes speech in multiple regional accents into standard orthography
Speech Recognition Chinese
panlr
20
4
Thai Trocr
Apache-2.0
A Thai and English optical character recognition model fine-tuned from the TrOCR base handwriting model, excelling in processing handwritten text line images
Text Recognition Transformers Supports Multiple Languages
openthaigpt
2,677
9
Tablecell Htr
MIT
This model is designed to recognize handwritten text from text line images in table cells, particularly suitable for handwritten text recognition in Finnish death records and census records from the 1930s.
Text Recognition
Kansallisarkisto
39
1
Phoneme Scorer V2 Wav2vec2
Apache-2.0
An automatic speech recognition model based on Wav2Vec2-Base architecture, specifically fine-tuned for phoneme recognition on the LJSpeech Phonemes dataset
Speech Recognition Transformers English
ct-vikramanantha
167
9
Wav2vec2 Base Korean
A Korean speech recognition model fine-tuned from Facebook's wav2vec2-base that transcribes Korean speech into text.
Speech Recognition Transformers Korean
Kkonjeong
448
1
OCR TextInput Base
A specialized image-to-text model for the financial domain, supporting English text recognition, primarily used for processing image content in financial documents.
Text Recognition Transformers English
rohit5895
31
0
Pretrained Trocr Small Vietnamese Nom
A small TrOCR-based model pretrained for recognizing Vietnamese Nôm (chữ Nôm) text in images.
Machine Translation Transformers Other
nxquang-al
19
2
Image Text Captcha V2
A fine-tuned printed text recognition model based on microsoft/trocr-base-printed, mainly used for captcha recognition tasks
Text Recognition Transformers
dragonstar
66
3
Whisper Small Japanese
Apache-2.0
This model is a Japanese speech recognition model fine-tuned based on openai/whisper-small, supporting Japanese speech-to-text tasks.
Speech Recognition Transformers Japanese
Ivydata
356
5
Trocr Base Printed Fr
MIT
A Transformer-based OCR model for printed French text, filling the gap left by the lack of a French version among the TrOCR models
Image-to-Text Transformers French
agomberto
110
2
Wav2vec2 Ljspeech Gruut
Apache-2.0
A phoneme recognition model based on the Wav2Vec2 architecture, fine-tuned on the LJSpeech Phonemes dataset, used to convert speech into phoneme sequences
Speech Recognition Transformers English
bookbot
2,484
17
Whisper Small Cantonese
Apache-2.0
A Cantonese speech recognition model fine-tuned based on OpenAI Whisper-small, achieving a CER of 7.93 on the Common Voice 16.0 test set
Speech Recognition Transformers Supports Multiple Languages
alvanlii
2,413
85
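The models on this page are grouped by character error rate (CER), such as the 7.93 reported for Whisper Small Cantonese above. As a rough illustration of how such a figure is typically obtained, the sketch below computes CER as the character-level edit distance between a reference and a hypothesis transcript, divided by the reference length. The function names `edit_distance` and `cer` are hypothetical, and the listed models may apply their own text normalization before scoring, so this is not necessarily the exact procedure behind any number on this page.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of four reference characters.
    print(cer("abcd", "abxd"))
```

A value of 0.0793 returned by `cer` would correspond to 7.93 if the score is reported as a percentage, which is the usual convention but is not stated in the listing.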
Stt Zh Conformer Transducer Large
This is a large Conformer-Transducer model for transcribing Mandarin speech, with approximately 120 million parameters, trained on the AISHELL-2 dataset.
Speech Recognition Chinese
nvidia
72
13
Stt Zh Citrinet 1024 Gamma 0 25
This is a non-autoregressive Citrinet model for Mandarin Chinese automatic speech recognition (ASR), with approximately 140 million parameters, using character encoding and CTC loss/decoding.
Speech Recognition Chinese
nvidia
92
5
Xls R 300m Et
An Estonian automatic speech recognition model fine-tuned based on facebook/wav2vec2-xls-r-300m, trained with approximately 800 hours of diverse data
Speech Recognition Transformers Other
TalTechNLP
58
1
Bp 400h Xlsr2 300M
Apache-2.0
This is a Portuguese automatic speech recognition model trained on the Mozilla Common Voice 7.0 dataset, supporting Portuguese speech-to-text tasks.
Speech Recognition Transformers Other
lgris
35
2
Wav2vec2 Large Xlsr 53 Th
This is an automatic speech recognition (ASR) model fine-tuned on the Common Voice 7.0 Thai dataset based on the wav2vec2-large-xlsr-53 model.
Speech Recognition Transformers Other
airesearch
110.74k
21
Wav2vec2 Bn 300m
Apache-2.0
A fine-tuned Bengali automatic speech recognition model based on facebook/wav2vec2-xls-r-300m, trained using the OPENSLR_SLR53 dataset
Speech Recognition Transformers Other
Tahsin-Mayeesha
25
4
Wav2vec2 Xls R 300m Cs Cv8
Apache-2.0
A speech recognition model fine-tuned on the Common Voice 8.0 Czech dataset based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers Other
comodoro
13
1
Wav2vec2 Large Xlsr 53 Chinese Zh Cn Gpt
Apache-2.0
A Chinese (zh-CN) speech recognition model fine-tuned on the Common Voice dataset based on facebook/wav2vec2-large-xlsr-53
Speech Recognition Transformers Chinese
ydshieh
127
32
Wav2vec2 Xls R 1b Ro
Apache-2.0
This model is an automatic speech recognition model fine-tuned on the Romanian Common Voice 7.0 dataset based on facebook/wav2vec2-xls-r-1b.
Speech Recognition Transformers Other
ubamba98
16
0
Wav2vec2 Large Xls R 300m Bg D2
Apache-2.0
An automatic speech recognition model fine-tuned on Bulgarian language datasets based on facebook/wav2vec2-xls-r-300m
Speech Recognition Transformers Other
DrishtiSharma
20
1
Xls R 1b Cv 8 Fr
Apache-2.0
This is a French automatic speech recognition model fine-tuned on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - FR dataset based on facebook/wav2vec2-xls-r-1b.
Speech Recognition Transformers French
Plim
26
0
Wav2vec2 Xls R Sl A1
Apache-2.0
This is an automatic speech recognition (ASR) model fine-tuned on the Slovenian language (Common Voice 8.0) dataset based on facebook/wav2vec2-xls-r-300m.
Speech Recognition Transformers Other
DrishtiSharma
25
0
Wav2vec2 Xls R 1b Npsc Bokmaal
Apache-2.0
An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on a Norwegian Bokmål speech dataset
Speech Recognition Transformers
NbAiLab
23
0
© 2025 AIbase